1 |
Shapley Idioms: Analysing BERT Sentence Embeddings for General Idiom Token Identification
In: Front Artif Intell (2022)
BASE
3 |
English WordNet Taxonomic Random Walk Pseudo-Corpora
In: Conference papers (2020)
BASE
4 |
Language related issues for machine translation between closely related south Slavic languages
BASE
5 |
Synthetic, Yet Natural: Properties of WordNet Random Walk Corpora and the impact of rare words on embedding performance
In: Conference papers (2019)
Abstract:
Creating word embeddings that reflect semantic relationships encoded in lexical knowledge resources is an open challenge. One approach is to use a random walk over a knowledge graph to generate a pseudo-corpus and use this corpus to train embeddings. However, the effect of the shape of the knowledge graph on the generated pseudo-corpora, and on the resulting word embeddings, has not been studied. To explore this, we use English WordNet, constrained to the taxonomic (tree-like) portion of the graph, as a case study. We investigate the properties of the generated pseudo-corpora, and their impact on the resulting embeddings. We find that the distributions in the pseudo-corpora exhibit properties found in natural corpora, such as Zipf’s and Heaps’ laws, and also observe that the proportion of rare words in a pseudo-corpus affects the performance of its embeddings on word similarity.
Keyword:
Artificial Intelligence and Robotics; Computational Linguistics; corpus; evaluation; Numerical Analysis and Scientific Computing; random walk; representations; Software Engineering; taxonomy; word embeddings; word similarity; WordNet
URL: https://arrow.tudublin.ie/scschcomcon/271 https://arrow.tudublin.ie/cgi/viewcontent.cgi?article=1283&context=scschcomcon
BASE
6 |
Size Matters: The Impact of Training Size in Taxonomically-Enriched Word Embeddings
In: Articles (2019)
BASE
8 |
Quantitative Fine-Grained Human Evaluation of Machine Translation Systems: a Case Study on English to Croatian ...
BASE
9 |
Is it worth it? Budget-related evaluation metrics for model selection ...
BASE
10 |
Quantitative Fine-grained Human Evaluation of Machine Translation Systems: a Case Study on English to Croatian
In: Articles (2018)
BASE
11 |
Is it worth it? Budget-related evaluation metrics for model selection
In: Conference papers (2018)
BASE
12 |
hr500k – A Reference Training Corpus of Croatian.
In: Conference papers (2018)
BASE
17 |
Fine-grained human evaluation of neural versus phrase-based machine translation ...
BASE
18 |
Fine-Grained Human Evaluation of Neural Versus Phrase-Based Machine Translation
In: Prague Bulletin of Mathematical Linguistics, Vol 108, Iss 1, Pp 121-132 (2017)
BASE